Ligand-Capped Cobalt(II) Multiplies the Value of the Double-Histidine Motif for PCS NMR Studies

In structural studies by NMR, pseudocontact shifts (PCSs) provide both angular and distance information. For proteins, incorporation of a di-histidine (diHis) motif, coordinated to Co2+, has emerged as an important tool to measure PCS. Here, we show that using different Co(II)-chelating ligands, such as nitrilotriacetic acid (NTA) and iminodiacetic acid (IDA), resolves the isosurface ambiguity of Co2+-diHis and yields orthogonal PCS data sets with different Δχ-tensors for the same diHis-bearing protein. Importantly, such capping ligands effectively eliminate undesired intermolecular interactions, which can be detrimental to PCS studies. Devising and employing ligand-capping strategies afford versatile and powerful means to obtain multiple orthogonal PCS data sets, significantly extending the use of the diHis motif for structural studies by NMR.


Protein expression and purification
Coding sequences for diHis GB1 (T2Q, K28H, Q32H) and diHis CA CTD dimer (144-231, D166H/K170H) were inserted into pET11a and pET41 vectors, respectively. DiHis GB1 and diHis CA CTD dimer were expressed in BL21 Star (DE3) cells in M9 medium containing 1 g/L 15NH4Cl. For 5F-Trp or 7F-Trp labeling, 20 mg/L of 5F- or 7F-indole (Sigma-Aldrich, St. Louis, MO, USA), respectively, was added at OD600 = 0.6, followed by induction after 20 min with 0.5 mM isopropyl β-D-thiogalactopyranoside (IPTG) (Sigma-Aldrich, St. Louis, MO, USA) and subsequent growth at 18 °C for 16 h. Cells were harvested by centrifugation at 4000 × g for 10 min and lysed by sonication for 10 min (2 s on and 3 s off at 50% power) in 50 mL QA buffer (20 mM Tris, pH 8.0). The cell lysate was clarified by centrifugation at 18,000 × g for 30 min, and the supernatant was fractionated on a HP Q column (GE Healthcare, Chicago, IL), followed by gel filtration on a Superdex 75 column (GE Healthcare, Chicago, IL) equilibrated in NMR buffer (20 mM HEPES, 150 mM NaCl, 0.2% NaN3, pH 7.0). DiHis CA CTD dimer (144-231) was expressed and purified as described previously 1. In brief, the cell lysate was subjected to fractionation over a HP SP column, followed by gel filtration over a Superdex 75 column. The purity and molecular masses of the proteins were confirmed by SDS-PAGE and ESI mass spectrometry, respectively.

Preparation of protein-metal complexes
NTA (nitrilotriacetic acid) and IDA (iminodiacetic acid) were premixed with 100 mM ZnSO4 or CoCl2 in NMR buffer at ratios of 1.5:1 and 10:1. The pH of the solutions containing the metal complexes was adjusted to 7.0 using 10 M NaOH or 37% HCl to minimize pH changes during the titrations. To measure intra- and inter-subunit PCSs for the CA CTD dimer, mixed isotopically labeled proteins were used. Two samples were prepared: in one, 10 equivalents of unlabeled (natural abundance) diHis CA CTD dimer were mixed with one equivalent of 15N, 7F-Trp-labeled WT CA CTD dimer; in the other, the labeling scheme was reversed. This allowed measurement of only intra- or only inter-subunit PCSs.

NMR spectroscopy
All 19F spectra were recorded at 283 K on a 14.1 T Bruker AVANCE spectrometer equipped with a CP TXO F/C-H-D triple-resonance, z-axis gradient cryoprobe. 19F chemical shifts were referenced with respect to trifluoroacetic acid. 19F spectra were collected with 4,096 data points and a spectral width of 20 ppm using a recycle delay of 1.5 s. The carrier frequency was set to -123 ppm and -133 ppm for 5F-Trp diHis GB1 and 7F-Trp diHis CA CTD dimer, respectively. For the 1H PCS measurements, 1H-15N HSQC spectra were recorded for diHis GB1 coordinated with Co2+/Zn2+, NTA-Co2+/Zn2+ and IDA-Co2+/Zn2+, and for the diHis CA CTD dimer as well as the two mixed CTD dimers (see above) coordinated with Co2+/Zn2+ and NTA-Co2+/Zn2+, using an interscan delay of 1 s and 128 complex points in the 15N dimension. For long-range 1H-15N HMBC spectra, the 15N carrier frequency was centered at 220 ppm and the J coupling constant was optimized to 22.7 Hz for the detection of histidine imidazole cross-peaks. Spectra were recorded with 200 scans and 90 complex points in the 15N dimension.
19F NMR titrations were performed to determine the binding affinities (Kd) between Co2+ or NTA-Co2+ and diHis GB1. Kd values were extracted from curve fitting of the bound and free populations as a function of the ratio (r) of protein and ligand concentration, where fB and fA denote the bound and free fractions of the protein during the titration, calculated from the respective integrated peak areas in the 19F spectra. c is the total protein concentration (200 µM) used in the NMR titration experiments and was fixed during curve fitting. The Kd between IDA-Co2+ and diHis GB1 was calculated to be 212 µM, based on the relative populations of diHis GB1 and IDA-Co2+-diHis GB1 after adding 0.5 equivalents of IDA-Co2+ (mixed at 5:1). A single titration point was used because, at higher molar ratios, more free IDA ligand (from the 5:1 mixture) is present and competes off IDA-Co2+ binding to diHis GB1.
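The fitting equation itself is not reproduced in the text above. As a minimal sketch, assuming a standard 1:1 binding model in which r is the ligand-to-protein molar ratio (an assumption; the direction of the ratio is not stated explicitly), the bound fraction satisfies (1 - fB)(r - fB) = (Kd/c) fB. The function names and the golden-section search below are illustrative choices, not the authors' fitting procedure, and the demonstration data are synthetic.

```python
import math

def bound_fraction(r, kd, c):
    """Bound protein fraction fB for 1:1 binding (assumed model).

    r  : total ligand / total protein molar ratio (assumed definition)
    kd : dissociation constant, same units as c
    c  : total protein concentration
    """
    b = 1.0 + r + kd / c
    # Smaller root of fB^2 - b*fB + r = 0, which lies between 0 and min(1, r).
    return (b - math.sqrt(b * b - 4.0 * r)) / 2.0

def kd_from_single_point(fb, r, c):
    """Kd from one measured bound fraction, by inverting the same relation."""
    return c * (1.0 - fb) * (r - fb) / fb

def fit_kd(ratios, fb_obs, c, kd_min=1e-3, kd_max=1e5, tol=1e-8):
    """Least-squares Kd via golden-section search over log10(Kd)."""
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((bound_fraction(r, kd, c) - f) ** 2
                   for r, f in zip(ratios, fb_obs))

    lo, hi = math.log10(kd_min), math.log10(kd_max)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = sse(x1), sse(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Minimum lies in [lo, x2]; reuse x1 as the new upper probe.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = sse(x1)
        else:
            # Minimum lies in [x1, hi]; reuse x2 as the new lower probe.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = sse(x2)
    return 10.0 ** ((lo + hi) / 2.0)

# Synthetic demonstration (not measured data): c = 200 uM as in the text,
# with a hypothetical Kd of 50 uM used to generate the bound fractions.
c = 200.0
ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
fb_synthetic = [bound_fraction(r, 50.0, c) for r in ratios]
print(f"fitted Kd = {fit_kd(ratios, fb_synthetic, c):.2f} uM")
```

The helper `kd_from_single_point` mirrors the kind of single-point estimate described for IDA-Co2+, under the simplifying assumption that competition from free IDA is negligible at that titration point.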
Data were processed in Topspin and analyzed in NMRFAM-Sparky 3, and the alignment tensors A were determined using the AF2 diHis GB1 structure in Paramagpy 4 and converted to Δχax,rh values according to the equation below:
$$\Delta\chi_{ax,rh} = \frac{15\,\mu_0 k_B T}{B_0^2}\,A_{ax,rh}$$
where B0 is the strength of the magnetic field, µ0 the vacuum permeability, kB the Boltzmann constant and T the temperature.
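As a numerical illustration of this conversion, the sketch below assumes the standard relation Δχ = 15 µ0 kB T A / B0² between the alignment tensor and the magnetic susceptibility anisotropy, with all quantities in SI units. The function names are hypothetical, and the example input is an arbitrary value rather than a fitted tensor from this work.

```python
# Physical constants in SI units (CODATA 2018 values).
MU0 = 1.25663706212e-6   # vacuum permeability, N/A^2
KB = 1.380649e-23        # Boltzmann constant, J/K

def alignment_to_delta_chi(a, b0, temperature):
    """Convert an alignment tensor component A (unitless) to delta-chi in m^3.

    b0          : magnetic field strength in tesla
    temperature : temperature in kelvin
    """
    return a * 15.0 * MU0 * KB * temperature / (b0 ** 2)

def delta_chi_to_alignment(delta_chi, b0, temperature):
    """Inverse conversion: delta-chi in m^3 to the unitless alignment tensor."""
    return delta_chi * (b0 ** 2) / (15.0 * MU0 * KB * temperature)

# Example: 14.1 T as stated in the text; 283 K is assumed here to apply to
# the HSQC measurements as well. The value 1e-4 is purely illustrative.
dchi_ax = alignment_to_delta_chi(1e-4, b0=14.1, temperature=283.0)
print(f"delta-chi_ax = {dchi_ax / 1e-32:.2f} x 10^-32 m^3")
```

Because the relation is linear, the same function applies to both the axial and rhombic components.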

19F R2 and relaxation dispersion measurements and analysis
19F R1 and R2 rates were measured by inversion recovery 5 and CPMG 6-7, respectively, using a recycle delay of 2 s. Data processing and analysis were performed in Topspin (Bruker) and MestReNova. R2 relaxation rates were obtained by fitting the intensity changes to single exponential functions (I(t) = I0*exp(-R2*t)). 19 [...] of the Δχ-tensor were determined from 20 iterations of the Δχ-tensor fit in which 20% of the 1H PCSs were randomly discarded. The map size, map density and RMSD contour level in the script were set to 20.0 Å, 4 points/Å and 0.02 ppm, respectively.
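The single-exponential model I(t) = I0*exp(-R2*t) used for the R2 rates can be sketched as follows. This is a simplified illustration that linearizes the model by regressing ln(I) on t; it is not the fitting routine of Topspin or MestReNova, which may use nonlinear least squares and weight the points differently. The function name and the synthetic decay are assumptions.

```python
import math

def fit_r2(delays, intensities):
    """Estimate I0 and R2 from I(t) = I0 * exp(-R2 * t).

    Uses ordinary least squares on ln(I) versus t, so all intensities
    must be positive. delays are in seconds; R2 is returned in s^-1.
    """
    ys = [math.log(i) for i in intensities]
    n = len(delays)
    t_mean = sum(delays) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in delays)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope

# Synthetic, noise-free decay with hypothetical values I0 = 1000, R2 = 25 s^-1.
delays = [0.002, 0.004, 0.008, 0.016, 0.032]
intensities = [1000.0 * math.exp(-25.0 * t) for t in delays]
i0, r2 = fit_r2(delays, intensities)
print(f"I0 = {i0:.1f}, R2 = {r2:.2f} s^-1")
```

With noisy data, the log transform gives more weight to low-intensity points, so a nonlinear fit of the untransformed intensities is generally preferable for final values.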

Computational modelling of protein-metal complex
Quantum mechanical geometry optimizations were carried out for the NTA-Zn2+ diHis coordination complex. The histidine coordinates were taken from residues 28 and 32 of the AF2 diHis GB1 model and truncated to contain only the histidine sidechains, including the Cα and Cβ atoms. The initial NTA-Zn2+ coordinates were taken from the crystal structure 11.
The diHis motif, with the Cα and Cβ atoms of the truncated histidine residues constrained to fixed positions, was combined with NTA-Zn2+ and energy minimized in Avogadro 12 to form a starting structure for each of the following histidine coordination states: the Nδ1-H tautomer for His32 and the Nε2-H tautomer for His28; the Nε2-H tautomer for His32 and the Nδ1-H tautomer for His28; and the Nδ1-H tautomer for both His32 and His28.
DFT-based geometry optimization was performed in ORCA 5.0 13 using the BP86 functional 14, the def2-SVP double-zeta basis set 15, the RI-J approximation 16 [...]
The initial NTA-Co2+ coordinates were taken from the crystal structure 21, and the torsion angles of the histidine residues of the diHis motif were adjusted to match those of the DFT-optimized diHis NTA-Cu2+ structures 22. Simulations of CA protein systems were carried out using the PMEMD module in the AMBER 20 software package 23. Standard protein residues as well as non-standard fluorinated residues 24 were treated using the ff15ipq force field 25. The GAFF2 force field 26 was used for the NTA ligand parameters, and AM1-BCC partial atomic charges for the ligands were calculated using Antechamber 27. The Co2+ ion in each system was treated using a 12-6-4 LJ-type non-bonded model 28. Each system was solvated in a truncated octahedral box of explicit SPC/Eb 29 water molecules with at least 12 Å clearance between the protein and the edges of the box. All systems with a net charge were first neutralized with Na+ or Cl- ions, treated with Joung and Cheatham ion parameters 30, before adding enough Na+ and Cl- ions to reach a 150 mM NaCl concentration. Protonation states for ionizable residues were adjusted to represent the major species present at pH 7.0.
Each system was subjected to energy minimization followed by a two-stage solvent equilibration. In the first equilibration stage, a 20 ps simulation was carried out at constant volume and temperature in the presence of solute heavy-atom positional restraints using a harmonic potential with a force constant of 1 kcal/(mol Å2). In the second stage, a 1 ns simulation was carried out at constant temperature and pressure using the same harmonic positional restraints.
Temperatures were maintained at 283 K using a Langevin thermostat with a friction constant of 1 ps-1, while pressure was maintained at 1 atm using a Monte Carlo barostat with 100 fs between system volume changes. Van der Waals and short-range electrostatic interactions were truncated at 10 Å, while long-range electrostatic interactions were calculated using the particle mesh Ewald method 31. To enable a 2 fs time step, all C-H and N-H bonds were constrained to their equilibrium values using the SHAKE algorithm 32.
The structure models of AlphaFold2 diHis GB1 and NTA-Zn2+-coordinated diHis GB1, as well as the NTA-Co2+-coordinated diHis CA CTD model, are available from the authors upon request or can be downloaded from GitHub (https://github.com/darianyang/gb1-pcs).
*The quality factor in brackets corresponds to the fit after excluding the data points colored in red in Figure 1c.
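The two-stage equilibration settings described in the modelling section can be collected into AMBER `&cntrl` input files. The sketch below is a hypothetical helper that generates such files from the stated parameters (2 fs time step, 283 K Langevin thermostat at 1 ps-1, 10 Å cutoff, 1 kcal/(mol Å2) restraints, Monte Carlo barostat every 100 fs). The restraint mask, the restart flags, and the function names are assumptions and are not taken from the authors' input files.

```python
TIME_STEP_PS = 0.002  # 2 fs time step, enabled by SHAKE on bonds to hydrogen

def steps(duration_ps):
    """Number of MD steps covering duration_ps at the 2 fs time step."""
    return round(duration_ps / TIME_STEP_PS)

def equilibration_mdin(title, duration_ps, constant_pressure, restart):
    """Build the text of an AMBER &cntrl input for one equilibration stage."""
    settings = {
        "imin": 0,                       # molecular dynamics, not minimization
        "irest": 1 if restart else 0,    # assumed: continue from prior stage
        "ntx": 5 if restart else 1,      # read velocities only when restarting
        "nstlim": steps(duration_ps),
        "dt": TIME_STEP_PS,
        "ntc": 2, "ntf": 2,              # SHAKE on bonds involving hydrogen
        "cut": 10.0,                     # real-space cutoff in angstrom
        "ntt": 3, "gamma_ln": 1.0,       # Langevin thermostat, 1 ps^-1
        "temp0": 283.0,
        "ntr": 1, "restraint_wt": 1.0,   # harmonic restraints, kcal/(mol A^2)
        # Hypothetical mask: heavy atoms that are not water or added ions.
        "restraintmask": "'!:WAT,Na+,Cl- & !@H='",
    }
    if constant_pressure:
        settings.update({
            "ntb": 2, "ntp": 1,
            "barostat": 2,               # Monte Carlo barostat
            "pres0": 1.01325,            # 1 atm expressed in bar
            "mcbarint": steps(0.1),      # volume move attempt every 100 fs
        })
    else:
        settings["ntb"] = 1              # constant volume
    body = ",\n".join(f"  {key} = {value}" for key, value in settings.items())
    return f"{title}\n &cntrl\n{body},\n /\n"

stage1 = equilibration_mdin("Stage 1: 20 ps NVT", 20.0, False, restart=False)
stage2 = equilibration_mdin("Stage 2: 1 ns NPT", 1000.0, True, restart=True)
print(stage1)
print(stage2)
```

Generating both stages from one function keeps the shared thermostat, cutoff, and restraint settings in a single place, so the two inputs differ only in ensemble, length, and restart behavior.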